A Hybrid Data Cleaning Framework Using Markov Logic Networks
Authors
Abstract
With the increase of dirty data, data cleaning has become a crux of data analysis. The accuracy of existing integrity-constraint-based cleaning approaches is limited by insufficient rules. In this paper, we present a novel hybrid data cleaning framework on top of Markov logic networks (MLNs), termed MLNClean, which is capable of learning instantiated rules to supplement the insufficient integrity constraints. MLNClean consists of two steps, i.e., pre-processing and two-stage data cleaning. In the pre-processing step, MLNClean first infers a set of probable instantiated rules according to MLNs and then builds a two-layer MLN index structure to generate multiple data versions and facilitate the cleaning process. In the two-stage data cleaning step, MLNClean first presents the concept of a reliability score to clean errors within each data version separately, and afterward eliminates the conflicting values among different data versions using a fusion score. Considerable experimental results on both real and synthetic scenarios demonstrate the effectiveness of MLNClean in practice.
Similar resources
Hybrid Markov Logic Networks
Markov logic networks (MLNs) combine first-order logic and Markov networks, allowing us to handle the complexity and uncertainty of real-world problems in a single consistent framework. However, in MLNs all variables and features are discrete, while most real-world applications also contain continuous ones. In this paper we introduce hybrid MLNs, in which continuous properties (e.g., the distan...
Probabilistic Abduction using Markov Logic Networks
Abduction is inference to the best explanation of a given set of evidence. It is important for plan or intent recognition systems. Traditional approaches to abductive reasoning have either used first-order logic, which is unable to reason under uncertainty, or Bayesian networks, which can handle uncertainty using probabilities but cannot directly handle an unbounded number of related entities. ...
Anatomy Ontology Matching Using Markov Logic Networks
The anatomy of model species is described in ontologies, which are used to standardize the annotations of experimental data, such as gene expression patterns. To compare such data between species, we need to establish relationships between ontologies describing different species. Ontology matching is one way to find semantic correspondences between entities of different ontologies. ...
Learning Markov Logic Networks Using Structural Motifs
Markov logic networks (MLNs) use first-order formulas to define features of Markov networks. Current MLN structure learners can only learn short clauses (4-5 literals) due to extreme computational costs, and thus are unable to represent complex regularities in data. To address this problem, we present LSM, the first MLN structure learner capable of efficiently and accurately learning long clause...
The LLUNATIC Data-Cleaning Framework
Data-cleaning (or data-repairing) is considered a crucial problem in many database-related tasks. It consists in making a database consistent with respect to a set of given constraints. In recent years, repairing methods have been proposed for several classes of constraints. However, these methods rely on ad hoc decisions and tend to hard-code the strategy to repair conflicting values. As a con...
Journal
Journal title: IEEE Transactions on Knowledge and Data Engineering
Year: 2022
ISSN: 1558-2191, 1041-4347, 2326-3865
DOI: https://doi.org/10.1109/tkde.2020.3012472